
[Issue #3527] Modify the logic around the opportunity change tracking table to never delete records #3565

Merged

Conversation

mikehgrantsgov (Collaborator)

Summary

Fixes #3527

Time to review: 30 mins

Changes proposed

Add a new table to track task progress
Rename the queue table and add all existing opportunities to it; any new opportunities that come in are added as well (should basically always be 1:1)
For the index job, load data based on the last_loaded_at date rather than using the has_update field or deleting records as we go (a rough sketch of this follows below)
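To illustrate that last point, here is a minimal sketch of what the date-based load could look like. The model and column names follow the PR discussion, but the query shape itself is an assumption, not the merged code:

from sqlalchemy import select

# Opportunity and OpportunityChangeAudit are the project's models; their
# import paths are omitted here, and the join/column names are assumed.
def fetch_opportunities_updated_since(db_session, last_loaded_at):
    # Pick up every opportunity whose change-audit row was touched since the
    # last successful index load, instead of popping and deleting queue rows.
    stmt = (
        select(Opportunity)
        .join(
            OpportunityChangeAudit,
            OpportunityChangeAudit.opportunity_id == Opportunity.opportunity_id,
        )
        .where(OpportunityChangeAudit.updated_at > last_loaded_at)
    )
    return db_session.scalars(stmt).all()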

Context for reviewers

Cascade deletion of the orphaned relationship remains unchanged
JobTable is meant to be general purpose and can be used, possibly ended, by other tasks.

Additional information

See unit tests



class JobTable(ApiSchemaTable, TimestampMixin):
    __tablename__ = "job"
Collaborator:

Let's have a slightly longer name, what about task_log or something like that?

Collaborator:

Thoughts on this? job_log might also be good?

Also want to avoid naming the table XTable

mikehgrantsgov marked this pull request as ready for review on January 17, 2025, 22:04
Comment on lines 65 to 67
except Exception:
    # Update job status to failed
    self.update_job(JobStatus.FAILED, metrics=self.metrics)
Collaborator:

We might want some tests to verify this works in a few scenarios and that the DB session behaves correctly; two that I can think of that might break here:

  1. An error occurs (unrelated to the DB); the DB session needs to be rolled back first.
  2. An error occurs while doing a DB operation; I think the DB session does get rolled back, but it might be in an unusable state.

Collaborator Author:

@chouinar I think the recent additions to test_task.py cover the scenarios you mention here. If a DB error is thrown in the task, it will roll back the transaction (which I don't think SQLAlchemy does automatically?) and begin a new transaction so the FAILED state can be stored for the job.
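As a rough illustration of the kind of test being described (FailingTask, the run() entry point, the db_session fixture, and the job_status column are hypothetical stand-ins, not necessarily what test_task.py uses):

import pytest
from sqlalchemy.exc import SQLAlchemyError

class FailingTask(Task):
    # Hypothetical task whose work raises a DB error partway through
    def run_task(self):
        raise SQLAlchemyError("simulated DB failure")

def test_db_error_still_marks_job_failed(db_session):
    task = FailingTask(db_session)
    with pytest.raises(SQLAlchemyError):
        task.run()
    # After the rollback and the fresh transaction, the FAILED status
    # should have been persisted for the job record (column name assumed).
    assert db_session.query(JobTable).one().job_status == JobStatus.FAILED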

Collaborator:

SQLAlchemy doesn't rollback automatically, but if it fails to get to the end of the transaction, it should at least fail to commit.
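For anyone following along, a small standalone illustration of that session behavior (simplified, not code from this PR): after a statement fails inside a transaction, the Session refuses further work until it is explicitly rolled back.

from sqlalchemy.exc import SQLAlchemyError

try:
    db_session.execute(some_statement_that_fails)  # hypothetical failing statement
except SQLAlchemyError:
    # Nothing was committed, but the transaction is now in a failed state;
    # subsequent statements (and the final commit) raise PendingRollbackError
    # until we roll back explicitly.
    db_session.rollback()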



Comment on lines 52 to 59
except SQLAlchemyError:
    # Rollback and begin new transaction only for DB-specific errors
    self.db_session.rollback()
    self.db_session.begin()
    raise
except Exception:
    # For non-DB errors, just raise without touching the transaction
    raise
Collaborator:

In what case would we want a non-DB error to still commit changes? If the task is failing, I'd rather we just rollback anyways and not commit. A partial success state seems more problematic, and if any such scenarios do exist, we should handle that on a particular task.

What about a pattern like this:

job_succeeded = True
try:
    self.run_task()
    ...  # timing stuff
except Exception:
    self.db_session.rollback()  # May or may not be necessary?
    job_succeeded = False
    raise
finally:
    job_status = JobStatus.COMPLETED if job_succeeded else JobStatus.FAILED
    with self.db_session.begin():
        self.update_job(job_status, metrics=self.metrics)

Would this hit any issues?

Collaborator Author:

If run_task raises an exception, the timing stuff and metrics would not run. Not sure if that's a big issue, though. I think the rollback() is not necessary here since the task would handle its own tx stuff (right? we are not relying on this task class for tx management?), and the finally block would start its new transaction with db_session.begin.

Otherwise I don't see any other issues here.

Collaborator Author:

I don't think this is quite right yet, but @chouinar if you have any ideas on this, maybe we can pair to make this implementation more elegant.
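One possible shape for that follow-up, sketched here only to capture the idea (attribute and method names are assumptions, and this is not the code that was ultimately merged): timing and status both land in the finally block so they are recorded even when run_task raises.

start = time.perf_counter()
job_succeeded = True
try:
    self.run_task()
except Exception:
    # Roll back so the session is usable for the status update below
    self.db_session.rollback()
    job_succeeded = False
    raise
finally:
    # Duration and final status are recorded whether or not run_task raised
    # (metrics is assumed to be a plain dict here)
    self.metrics["task_duration_sec"] = round(time.perf_counter() - start, 3)
    job_status = JobStatus.COMPLETED if job_succeeded else JobStatus.FAILED
    with self.db_session.begin():
        self.update_job(job_status, metrics=self.metrics)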


chouinar (Collaborator) left a comment:

Just two small things, sorry for the slow reviews on this PR

Comment on lines 49 to 51
logger.info("Starting %s", self.cls_name())
start = time.perf_counter()

Collaborator:

Why did this get moved after the run_task command? Should be the first thing in the try block

Collaborator Author:

Ah, I was moving things around while testing something out. Moved it back.

self.db_session.execute(
    update(OpportunityChangeAudit)
    .where(OpportunityChangeAudit.opportunity_id.in_(processed_opportunity_ids))
    .values(updated_at=get_now_us_eastern_datetime())
)
Collaborator:

In the DB, we use UTC for everything, I would make this call the UTC function, not the eastern one

Collaborator Author:

Using the dateutil here now, thanks
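For reference, the same update with a UTC timestamp, shown here with the standard library directly (the project's own datetime utility is what the PR actually switched to):

from datetime import datetime, timezone
from sqlalchemy import update

# Stamp the processed audit rows with a timezone-aware UTC timestamp so the
# stored values match the UTC convention used elsewhere in the DB.
self.db_session.execute(
    update(OpportunityChangeAudit)
    .where(OpportunityChangeAudit.opportunity_id.in_(processed_opportunity_ids))
    .values(updated_at=datetime.now(timezone.utc))
)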

mikehgrantsgov merged commit 029ea8e into main on Jan 27, 2025
2 checks passed
mikehgrantsgov deleted the mikehgrantsgov/3527-modify-load-opp-logic-never-delete branch on January 27, 2025, 16:58